Abstract: Reliable support detection is essential for a
greedy algorithm to reconstruct a sparse signal accurately from 
compressed and noisy measurements. This paper proposes a 
novel support detection method for greedy algorithms, which is 
referred to as "maximum a posteriori (MAP) support detection".
Unlike existing support detection methods that identify support 
indices with the largest correlation value in magnitude per 
iteration, the proposed method selects them with the largest 
likelihood ratios computed under the true and null support 
hypotheses by simultaneously exploiting the distributions of the sensing
matrix, sparse signal, and noise. Leveraging this technique,
MAP-Matching Pursuit (MAP-MP) is first presented to show the 
advantages of exploiting the proposed support detection method, 
and a sufficient condition for perfect signal recovery is derived for 
the case when the sparse signal is binary. Subsequently, a set of 
iterative greedy algorithms, called MAP-generalized Orthogonal 
Matching Pursuit (MAP-gOMP), MAP-Compressive Sampling 
Matching Pursuit (MAP-CoSaMP), and MAP-Subspace Pursuit 
(MAP-SP) are presented to demonstrate the applicability of the 
proposed support detection method to existing greedy algorithms. 
From empirical results, it is shown that the proposed greedy 
algorithms with highly reliable support detection can be better, 
faster, and easier to implement than basis pursuit via linear 
programming. 

I. Introduction 

Compressive sensing (CS) is a technique to reconstruct sparse signals from compressed measurements. CS
has received great attention due to its broad application areas,
including imaging, radar, and communication systems.
The fundamental theory of CS guarantees recovery of a high-dimensional
signal vector from linear measurements that are
far fewer in number than the signal's dimension, provided that
the sparsity of the signal, i.e., the number of nonzero elements, is
smaller than a certain fraction of the number of measurements.

Denoting the sparse signal vector and the compressive
sensing matrix as $\mathbf{x} \in \mathbb{R}^{N}$ and $\mathbf{\Phi} \in \mathbb{R}^{M \times N}$, respectively,
with $M < N$, the optimal sparse recovery solution can be
theoretically obtained by solving the $\ell_0$-minimization problem

$$\min \|\mathbf{x}\|_0 \quad \text{subject to} \quad \mathbf{y} = \mathbf{\Phi}\mathbf{x}. \tag{1}$$

In practice, however, solving this problem is NP-hard and
computationally infeasible for a large signal dimension $N$.

The design of computationally efficient sparse signal recovery
algorithms has been studied extensively in prior work. Basis
pursuit (BP) is a representative sparse signal recovery
algorithm that leverages convex optimization. By relaxing the
$\ell_0$-minimization problem to an $\ell_1$-minimization problem, it has
been shown that the sparse signal recovery problem can be
solved with stability and uniform guarantees using linear
programming, but with polynomially bounded computational
complexity. For example, an interior point method that solves
the $\ell_1$-minimization problem has an associated computational
complexity of $O(M^2 N^{3/2})$.

As a result, greedy algorithms are also popular because
their complexity is lower than that of BP, although stability
and recovery guarantees are challenging to prove [10], [17]-[19]. The
underlying idea of greedy algorithms is to estimate the nonzero
elements of a sparse vector iteratively. Orthogonal matching
pursuit (OMP) [10], [11] is a well-known greedy algorithm that,
in each iteration, selects the coordinate of $\mathbf{x}$ whose column
vector in the sensing matrix has the maximum absolute correlation
with the residual vector. By subtracting the contribution of the
selected columns from the measurement vector $\mathbf{y}$, the algorithm
updates the support estimate of $\mathbf{x}$ in an iterative manner. Although
this algorithm is simple to implement, it is vulnerable to the error
propagation effect. This is because the OMP algorithm is not
capable of removing incorrectly estimated support indices once they
are added to the support set during the iterations, which leads
to significant performance degradation in the signal recovery.
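The OMP iteration described above can be sketched as follows. This is a minimal illustration, not the reference implementation of the cited works: the function name `omp`, the use of NumPy, and the choice to normalize each correlation by the column norm are assumptions made here for concreteness.

```python
import numpy as np

def omp(Phi, y, K):
    """OMP sketch: pick the column most correlated with the residual, then re-fit."""
    M, N = Phi.shape
    norms = np.linalg.norm(Phi, axis=0)
    support = []                       # selected indices, in order of selection
    coef = np.zeros(0)
    residual = y.copy()
    for _ in range(K):
        # Normalized absolute correlation of every column with the residual.
        corr = np.abs(Phi.T @ residual) / norms
        if support:
            corr[support] = -np.inf    # never re-select an index
        support.append(int(np.argmax(corr)))
        # Least-squares fit on the selected columns (orthogonal projection).
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
    x_hat = np.zeros(N)
    x_hat[support] = coef
    return x_hat, sorted(support)
```

Note that an index, once appended to `support`, is never removed, which is exactly the source of the error propagation discussed above.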

Several other advanced greedy algorithms have been proposed
to overcome the error propagation effect, which include
Stagewise Orthogonal Matching Pursuit (StOMP) [15], iterative
hard thresholding (IHT) [16], generalized OMP (gOMP) [17],
Compressive Sampling Matching Pursuit (CoSaMP) [18],
and Subspace Pursuit (SP) [19]. The underlying principle of
these advanced greedy algorithms is the selection of multiple
support indices per iteration, leading to a decrease in
the probability of estimating incorrect support elements. For
example, in each iteration, StOMP [15] identifies multiple
support indices such that the correlation value in magnitude
between the current residual vector and the corresponding
column vector of $\mathbf{\Phi}$ exceeds a predefined threshold. Similarly,
gOMP [17] chooses the $L$ support indices that provide the largest
correlations in magnitude per iteration, where $L$ is a fixed
parameter given in the algorithm. CoSaMP [18] and SP [19]
also identify multiple support indices per iteration, but differ
from StOMP and gOMP in that they perform a two-stage
sparse signal estimation that allows support candidates to be
added or removed adaptively. A common shortcoming of
these greedy algorithms [10], [15], [17]-[19] is that they
rely on the order statistics of the correlation value in magnitude
for the support estimation.

Depending on the statistical distributions of the sensing matrix,
sparse signal, and noise, however, the selection of the index
with the largest correlation value may not be optimal in the
sense of support detection probability. With this motivation,
greedy algorithms called Bayesian matching pursuit were proposed
in [24]-[27]. The key idea of Bayesian matching pursuit
is the use of distributions of the sparse signal and noise in the
support detection step. For example, fast Bayesian matching
pursuit (FBMP) [24] performs sparse signal estimation via
model selection, assuming a Gaussian distribution for the
sparse vector. Similarly, in [26], assuming that the elements
of a sparse signal are Bernoulli-Gaussian mixed variables and
that the sensing matrix is deterministic and given, the algorithms jointly
update a support index and the corresponding signal element
at each iteration in order to maximize the increase of a
local likelihood function. Although these approaches show
better sparse recovery performance than conventional
matching pursuit algorithms in the presence of noise, they are
applicable only to certain distributions of $\mathbf{x}$, such as Bernoulli-Gaussian,
and there are no provable performance guarantees.

In this paper, we continue in the same spirit of harnessing the
statistical distributions of the sparse signal, sensing matrix, and
noise for support detection in greedy algorithms. Our main
contribution is to propose a novel support detection method
for greedy algorithms, which is referred to as maximum a
posteriori (MAP) support detection. The key difference from
prior work in [24]-[27] is that the proposed method estimates
supports with the largest log-MAP ratio values computed
under the true and null support hypotheses in each iteration by
incorporating the distributions of the sensing matrix, the sparse
signal, and noise jointly. Specifically, assuming the sensing
matrix has elements that are drawn from independent and
identically distributed (IID) Gaussian random variables, and
the sparse signal has non-zero elements that follow an arbitrary
distribution, the proposed method selects the support element
having the maximum log-MAP ratio instead of selecting
indices that exceed a certain threshold, as in prior Bayesian matching
pursuit methods such as [24]. By
leveraging this technique, we first present a novel greedy
algorithm named "MAP-Matching Pursuit (MAP-MP)" for
binary sparse signal reconstruction. Using this, it is shown that
MAP-MP exactly recovers a $K$-sparse binary signal within $K$
iterations almost surely, provided that the number
of measurements scales as

$$M = O\big((K + \tilde{\sigma}_w^2)\log(N)\big), \tag{2}$$

where $\tilde{\sigma}_w^2$ is the normalized noise variance. This condition
extends the existing statistical guarantees proven in [10] by
incorporating a noise effect. Next, we extend our MAP-MP
algorithm to sparse signals with an arbitrary distribution
using a moment matching technique. Subsequently,
applying the proposed MAP support detection method, we
propose a set of iterative greedy algorithms, called MAP-Orthogonal
Matching Pursuit (MAP-OMP), MAP-generalized
OMP (MAP-gOMP), MAP-Compressive Sampling Matching
Pursuit (MAP-CoSaMP), and MAP-Subspace Pursuit (MAP-SP),
to demonstrate the applicability of the proposed support
detection method in improving the recovery performance of
the existing algorithms. From the empirical results, it is shown
that the proposed algorithms provide significant gains in the
perfect recovery performance compared to that of the existing
greedy algorithms as well as an $\ell_1$-minimization algorithm via
BP.

II. Problem Statement 

We consider a sparse signal detection problem from compressed
and noisy measurements. Let us denote an $N$-dimensional
input signal vector by $\mathbf{x} \in \mathbb{R}^{N}$. We assume that the
input vector is $K$-sparse, i.e., $\|\mathbf{x}\|_0 = K \ll N$, and that the
sparsity level $K$ is known a priori. This prior information can
be estimated accurately in some applications using a cross
validation technique. We denote the true support set
by $\mathcal{T} \subseteq \{1, 2, \ldots, N\}$ with $|\mathcal{T}| = K$. The non-zero entries of
$\mathbf{x}$ are independently distributed according to continuous distributions, i.e.,
$P(\mathbf{x}) = \prod_{k \in \mathcal{T}} P_k(x_k)$. Furthermore, we denote the sensing
matrix consisting of $N$ column vectors by $\mathbf{\Phi} \in \mathbb{R}^{M \times N}$,

$$\mathbf{\Phi} = [\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_N], \tag{3}$$

where $\mathbf{a}_n$ denotes the $n$th dictionary vector whose entries are
drawn from an IID Gaussian distribution with zero
mean and variance $\frac{1}{M}$, i.e., $\mathcal{N}(0, \frac{1}{M})$. Then, the measurement
equation is given by

$$\mathbf{y} = \mathbf{\Phi}\mathbf{x} + \mathbf{w}, \tag{4}$$

where $\mathbf{y} \in \mathbb{R}^{M}$ and $\mathbf{w} \in \mathbb{R}^{M}$ are the measurement and noise
vectors, respectively. All entries of the noise vector are assumed
to be IID Gaussian random variables with zero mean and
variance $\sigma_w^2$, i.e., $\mathcal{N}(0, \sigma_w^2)$.

Throughout this paper, the difference between two sets $\mathcal{T}$
and $\mathcal{S}$ is denoted by $\mathcal{T} \setminus \mathcal{S}$. We use the subscript notations
$\mathbf{x}|_{\mathcal{S}}$ and $\mathbf{\Phi}|_{\mathcal{S}}$ to denote that vector $\mathbf{x}$ and matrix $\mathbf{\Phi}$ are
restricted to only the elements or columns in set $\mathcal{S}$.
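As an illustration of the measurement model in (3) and (4), the following sketch draws one random problem instance. The helper name `make_instance`, the NumPy generator, and the choice of binary non-zero entries (the case treated in the next section) are assumptions made for this example only.

```python
import numpy as np

def make_instance(N, M, K, sigma_w, rng):
    """Draw (Phi, x, y, support) from the model y = Phi x + w."""
    # Sensing matrix with IID N(0, 1/M) entries, so columns have norm close to one.
    Phi = rng.normal(0.0, 1.0 / np.sqrt(M), size=(M, N))
    # Support drawn uniformly at random; non-zero entries set to one for simplicity.
    support = np.sort(rng.choice(N, size=K, replace=False))
    x = np.zeros(N)
    x[support] = 1.0
    # IID Gaussian noise with standard deviation sigma_w.
    w = rng.normal(0.0, sigma_w, size=M)
    y = Phi @ x + w
    return Phi, x, y, support

rng = np.random.default_rng(0)
Phi, x, y, support = make_instance(N=256, M=64, K=8, sigma_w=0.05, rng=rng)
```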

III. MAP-Matching Pursuit 

In this section, we first present MAP-MP, a recovery algorithm
for a binary sparse signal $\mathbf{x} \in \{0,1\}^{N}$. Then, we derive a
bound that provides a sufficient condition for perfect signal
recovery to demonstrate provable performance guarantees of
the proposed algorithm.

A. Algorithm

Similar to other greedy algorithms, MAP-MP
sequentially finds support indices and
estimates the signal representation within a certain number of
iterations. The core difference between the proposed MAP-MP
algorithm and the prior OMP-type algorithms lies in the
selection rule of the support index per iteration. In contrast to
the OMP-type greedy algorithms, MAP-MP chooses the index
based on a maximum likelihood hypothesis test by leveraging
statistical properties of the sensing matrix and the sparse signal.

We begin by providing lemmas that are required for
explaining the MAP-MP algorithm. Lemma 1 provides the
distribution of the inner product between two dictionary
vectors (atoms) generated by IID Gaussian random variables. Lemma 2
yields the distribution of the 2-norm of each dictionary vector.
Lemma 3, in turn, provides the asymptotic behavior of the
2-norm of each dictionary vector when the measurement size $M$
goes to infinity.
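Lemma 1. Suppose that all the elements of $\mathbf{a}_n$ for $n \in [1:N]$
are drawn from an IID Gaussian distribution with zero mean and
variance $\frac{1}{M}$. Then, the distribution of $\frac{\mathbf{a}_n^T \mathbf{a}_\ell}{\|\mathbf{a}_n\|_2}$ for $\ell \neq n$ is Gaussian with
zero mean and variance $\frac{1}{M}$, i.e., $\frac{\mathbf{a}_n^T \mathbf{a}_\ell}{\|\mathbf{a}_n\|_2} \sim \mathcal{N}(0, \frac{1}{M})$.

Proof: See Appendix A. ■

Lemma 2. The distribution of the norm $\|\mathbf{a}_n\|_2$ is

$$f_{\|\mathbf{a}_n\|_2}(x) = \frac{2^{1-\frac{M}{2}} M^{\frac{M}{2}} x^{M-1} e^{-\frac{M x^2}{2}}}{\Gamma\left(\frac{M}{2}\right)}. \tag{5}$$

Proof: See Appendix B. ■

Lemma 3. The norm $\|\mathbf{a}_n\|_2$ of each dictionary vector for
$n \in [1:N]$ concentrates to one asymptotically as $M$ goes to
infinity, i.e.,

$$\lim_{M \to \infty} P\big[\,\big|\|\mathbf{a}_n\|_2 - 1\big| > \epsilon\,\big] = 0 \tag{6}$$

for any positive $\epsilon > 0$.

Proof: See Appendix C. ■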
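A quick Monte Carlo check can make Lemmas 1 and 3 tangible. The sketch below is only a numerical illustration under the stated Gaussian model; the function names, trial counts, and seed are arbitrary choices made here, not part of the original analysis.

```python
import numpy as np

rng = np.random.default_rng(1)

def column_norm_stats(M, trials, rng):
    """Empirical mean and standard deviation of ||a_n||_2 for IID N(0, 1/M) entries."""
    A = rng.normal(0.0, 1.0 / np.sqrt(M), size=(trials, M))
    norms = np.linalg.norm(A, axis=1)
    return norms.mean(), norms.std()

def normalized_inner_product_var(M, trials, rng):
    """Empirical variance of a_n^T a_l / ||a_n||_2 for independent columns a_n, a_l."""
    a_n = rng.normal(0.0, 1.0 / np.sqrt(M), size=(trials, M))
    a_l = rng.normal(0.0, 1.0 / np.sqrt(M), size=(trials, M))
    z = np.sum(a_n * a_l, axis=1) / np.linalg.norm(a_n, axis=1)
    return z.var()

for M in (16, 256):
    mean, std = column_norm_stats(M, 5000, rng)
    var = normalized_inner_product_var(M, 5000, rng)
    # Lemma 3 suggests mean -> 1 and std -> 0; Lemma 1 suggests M * var is close to 1.
    print(f"M={M}: mean norm={mean:.3f}, std={std:.3f}, M*var={M * var:.3f}")
```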

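By leveraging these lemmas, we explain the proposed
algorithm. In the $k$th iteration, the algorithm produces $N$ correlation
values $\{z_1^k, z_2^k, \ldots, z_N^k\}$ by computing the inner product
between the residual vector $\mathbf{r}^{k-1}$ updated in the $(k-1)$th
iteration and the $n$th column vector $\mathbf{a}_n$, i.e., $z_n^k = \frac{\mathbf{a}_n^T \mathbf{r}^{k-1}}{\|\mathbf{a}_n\|_2}$
for $n \in [1:N]$. Under the premise that the algorithm has
perfectly found the elements of the true support, i.e., $\hat{x}_\ell = 1$
for $\ell \in \mathcal{S}^{k-1}$, the residual vector is

$$\mathbf{r}^{k-1} = \sum_{\ell \in \mathcal{T} \setminus \mathcal{S}^{k-1}} \mathbf{a}_\ell x_\ell + \mathbf{w}, \tag{7}$$

where $\mathcal{S}^{k-1} \subset \mathcal{T}$ and $|\mathcal{S}^{k-1}| = k-1$. Then, the inner product
value $z_n^k = \frac{\mathbf{a}_n^T \mathbf{r}^{k-1}}{\|\mathbf{a}_n\|_2}$ can be expressed as a linear combination
of the remaining non-zero elements and their corresponding
supports as follows:

$$z_n^k = \sum_{\ell \in \mathcal{T} \setminus \mathcal{S}^{k-1}} \frac{\mathbf{a}_n^T \mathbf{a}_\ell}{\|\mathbf{a}_n\|_2} x_\ell + \frac{\mathbf{a}_n^T \mathbf{w}}{\|\mathbf{a}_n\|_2}
= \|\mathbf{a}_n\|_2 x_n + \sum_{\ell \in \mathcal{T} \setminus \{\mathcal{S}^{k-1} \cup \{n\}\}} \frac{\mathbf{a}_n^T \mathbf{a}_\ell}{\|\mathbf{a}_n\|_2} x_\ell + \frac{\mathbf{a}_n^T \mathbf{w}}{\|\mathbf{a}_n\|_2}. \tag{8}$$

Using (8), the proposed MAP-MP algorithm performs the
hypothesis test with two hypotheses corresponding to $x_n = 0$
and $x_n = 1$, respectively, as follows:

$$\mathcal{H}_0: \quad z_n^k = \sum_{\ell \in \mathcal{T} \setminus \mathcal{S}^{k-1}} \frac{\mathbf{a}_n^T \mathbf{a}_\ell}{\|\mathbf{a}_n\|_2} x_\ell + \frac{\mathbf{a}_n^T \mathbf{w}}{\|\mathbf{a}_n\|_2}, \tag{9}$$

$$\mathcal{H}_1: \quad z_n^k = \|\mathbf{a}_n\|_2 x_n + \sum_{\ell \in \mathcal{T} \setminus \{\mathcal{S}^{k-1} \cup \{n\}\}} \frac{\mathbf{a}_n^T \mathbf{a}_\ell}{\|\mathbf{a}_n\|_2} x_\ell + \frac{\mathbf{a}_n^T \mathbf{w}}{\|\mathbf{a}_n\|_2}, \tag{10}$$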


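where $\mathcal{H}_0$ is the null hypothesis such that the $n$th column
vector $\mathbf{a}_n$ is not in the support, i.e., $x_n = 0$ ($n \notin \mathcal{T}$), and $\mathcal{H}_1$ is
the alternate hypothesis indicating that the $n$th column vector
belongs to the support and the corresponding signal value is 1,
i.e., $x_n = 1$ ($n \in \mathcal{T}$). These two hypotheses in (9) and (10)
involve multiple levels of randomness, namely,

1) The randomness associated with the inner product between
two distinct vectors $\frac{\mathbf{a}_n}{\|\mathbf{a}_n\|_2}$ (unit norm) and $\mathbf{a}_\ell$;
this is distributed as a Gaussian random variable, i.e.,
$\frac{\mathbf{a}_n^T \mathbf{a}_\ell}{\|\mathbf{a}_n\|_2} \sim \mathcal{N}(0, \frac{1}{M})$ for $\ell \neq n$, as shown in Lemma 1
(see Appendix).

2) The randomness associated with the effective noise
$\frac{\mathbf{a}_n^T \mathbf{w}}{\|\mathbf{a}_n\|_2}$; this is Gaussian with zero mean and variance
$\sigma_w^2$, i.e., $\frac{\mathbf{a}_n^T \mathbf{w}}{\|\mathbf{a}_n\|_2} \sim \mathcal{N}(0, \sigma_w^2)$, as $\mathbf{w}$ is isotropically distributed
in $\mathbb{R}^{M}$.

3) The randomness associated with the sum
of independent Gaussian random variables,
$\sum_{\ell \in \mathcal{T} \setminus \mathcal{S}^{k-1}} \frac{\mathbf{a}_n^T \mathbf{a}_\ell}{\|\mathbf{a}_n\|_2} x_\ell + \frac{\mathbf{a}_n^T \mathbf{w}}{\|\mathbf{a}_n\|_2}$; this
is also Gaussian with zero mean and variance
$\mathbb{E}\big[(z_n^k)^2\big] = \frac{K-(k-1)}{M} + \sigma_w^2$, since
$\frac{\mathbf{a}_n^T \mathbf{a}_\ell}{\|\mathbf{a}_n\|_2}$, $\frac{\mathbf{a}_n^T \mathbf{a}_j}{\|\mathbf{a}_n\|_2}$, and $\frac{\mathbf{a}_n^T \mathbf{w}}{\|\mathbf{a}_n\|_2}$
are mutually independent Gaussian random variables
for $\ell \neq j$.

4) The randomness associated with the norm of the column
vector $\|\mathbf{a}_n\|_2$; this is a scaled chi distribution
with $M$ degrees of freedom, i.e.,
$f_{\|\mathbf{a}_n\|_2}(x) = \frac{2^{1-\frac{M}{2}} M^{\frac{M}{2}} x^{M-1} e^{-\frac{M x^2}{2}}}{\Gamma(\frac{M}{2})}$,
as shown in Lemma 2.

Using these facts, the conditional distribution of $z_n^k$ under
the null hypothesis is given by

$$P(z_n^k \mid x_n = 0) = \frac{1}{\sqrt{2\pi}\,\sigma_0} \exp\left(-\frac{(z_n^k)^2}{2\sigma_0^2}\right), \tag{11}$$

where $\sigma_0 = \sqrt{\frac{K-(k-1)}{M} + \sigma_w^2}$. Similarly, under the hypothesis
of $x_n = 1$ and $\|\mathbf{a}_n\|_2 = u$, the conditional distribution of $z_n^k$
is Gaussian with mean $u$ and variance $\frac{K-k}{M} + \sigma_w^2$, i.e.,

$$P(z_n^k \mid x_n = 1, \|\mathbf{a}_n\|_2 = u) = \frac{1}{\sqrt{2\pi}\,\sigma_1} \exp\left(-\frac{(z_n^k - u)^2}{2\sigma_1^2}\right), \tag{12}$$

where $\sigma_1 = \sqrt{\frac{K-k}{M} + \sigma_w^2}$. From Lemma 2, by
marginalizing the conditional distribution in (12) with respect
to $u$, we obtain the conditional distribution under the hypothesis
of $x_n = 1$ as

$$P(z_n^k \mid x_n = 1) = \mathbb{E}_{\|\mathbf{a}_n\|_2}\big[P(z_n^k \mid x_n = 1, \|\mathbf{a}_n\|_2)\big]
= \int_0^\infty \frac{1}{\sqrt{2\pi}\,\sigma_1} e^{-\frac{(z_n^k - u)^2}{2\sigma_1^2}} \, \frac{2^{1-\frac{M}{2}} M^{\frac{M}{2}} u^{M-1} e^{-\frac{M u^2}{2}}}{\Gamma\left(\frac{M}{2}\right)} \, du. \tag{13}$$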


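This conditional distribution is intractable to analyze due to the
integral expression. Applying Jensen's inequality, we obtain
a lower bound of the conditional distribution function in
closed form as follows:

$$P(z_n^k \mid x_n = 1) \geq \frac{1}{\sqrt{2\pi}\,\sigma_1} \exp\left(-\frac{\mathbb{E}\big[(z_n^k - \|\mathbf{a}_n\|_2)^2\big]}{2\sigma_1^2}\right)
\geq \frac{1}{\sqrt{2\pi}\,\sigma_1} \exp\left(-\frac{\big(z_n^k - \mathbb{E}[\|\mathbf{a}_n\|_2]\big)^2}{2\sigma_1^2}\right)
= \frac{1}{\sqrt{2\pi}\,\sigma_1} \exp\left(-\frac{\Big(z_n^k - \sqrt{\frac{2}{M}}\frac{\Gamma(\frac{1+M}{2})}{\Gamma(\frac{M}{2})}\Big)^2}{2\sigma_1^2}\right), \tag{14}$$

where the first and the second inequalities follow from the
facts that $e^{-x}$ and $(a - x)^2$ are convex functions with respect
to $x$ for any $a$, respectively. The last equality is because
$\mathbb{E}[\|\mathbf{a}_n\|_2] = \sqrt{\frac{2}{M}}\frac{\Gamma(\frac{1+M}{2})}{\Gamma(\frac{M}{2})}$, as shown in Lemma 2. From
Lemma 3, it is shown that this lower bound becomes tight,
as the distribution of $\|\mathbf{a}_n\|_2$ converges to its mean value
$\lim_{M \to \infty} \sqrt{\frac{2}{M}}\frac{\Gamma(\frac{1+M}{2})}{\Gamma(\frac{M}{2})} = 1$ almost surely. As a result, for
large enough $M$, the conditional distribution under the hypothesis
of $x_n = 1$ is simply approximated as

$$P(z_n^k \mid x_n = 1) \approx \frac{1}{\sqrt{2\pi}\,\sigma_1} \exp\left(-\frac{(z_n^k - 1)^2}{2\sigma_1^2}\right). \tag{15}$$

Leveraging the conditional probability density functions in
(11) and (15), the MAP ratio for a given observation is

$$\Lambda(z_n^k) = \ln\left(\frac{P(n \in \mathcal{T} \mid z_n^k)}{P(n \notin \mathcal{T} \mid z_n^k)}\right)
\overset{(a)}{=} \ln\left(\frac{P(z_n^k \mid n \in \mathcal{T})\,P(n \in \mathcal{T})}{P(z_n^k \mid n \notin \mathcal{T})\,P(n \notin \mathcal{T})}\right)
\overset{(b)}{=} \frac{(z_n^k)^2}{2\sigma_0^2} - \frac{(z_n^k - 1)^2}{2\sigma_1^2} + \ln\left(\frac{\sigma_0}{\sigma_1}\right) + \ln\left(\frac{K}{N-K}\right), \tag{16}$$

where (a) follows from Bayes' rule and (b) comes from
the assumption that the $K$ non-zero supports are uniformly
distributed from 1 to $N$. This log-likelihood ratio value carries
reliability information about how likely the $n$th column vector in the
sensing matrix is to belong to the true support in the
$k$th iteration. Accordingly, at iteration $k \in \{1, \ldots, K-1\}$, the
proposed MAP-MP algorithm selects the index $J^k$ that maximizes
$\Lambda(z_n^k)$, namely,

$$J^k = \arg\max_{n \in [1:N]} \Lambda(z_n^k)
= \arg\max_{n \in [1:N]} \left\{ \frac{(z_n^k)^2}{2\left(\frac{K-k+1}{M} + \sigma_w^2\right)} - \frac{(z_n^k - 1)^2}{2\left(\frac{K-k}{M} + \sigma_w^2\right)} \right\}. \tag{17}$$

Once index $J^k$ is selected, MAP-MP estimates the new sparse
representation using the updated support set $\mathcal{S}^k = \mathcal{S}^{k-1} \cup \{J^k\}$.
Since the signal is assumed to be binary, the new
sparse representation is set to one, namely,

$$\hat{x}_{J^k} = 1. \tag{18}$$

Lastly, to remove the contribution of $\hat{x}_{J^k}$, we update the
residual signal such that

$$\mathbf{r}^k = \mathbf{y} - \mathbf{\Phi}|_{\mathcal{S}^k}\, \hat{\mathbf{x}}|_{\mathcal{S}^k}. \tag{19}$$

B. Remarks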


To obtain more insight on the proposed support detection 
method, it is instructive to consider certain special cases. 

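Noise-Free Case: Let us consider the case of noise-free
compressive sensing, i.e., $\sigma_w^2 = 0$. The log-MAP ratio boils
down to

$$\Lambda(z_n^k) = \frac{M (z_n^k)^2}{2(K-k+1)} - \frac{M (z_n^k - 1)^2}{2(K-k)} + \frac{1}{2}\ln\left(\frac{K-k+1}{K-k}\right) + \ln\left(\frac{K}{N-K}\right). \tag{20}$$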


This expression clearly shows that the MAP ratio in the $k$th
iteration is a function of the relevant system parameters: the
dimension of the measurement vector $M$ and the sparsity
level $K$. One key property of the proposed algorithm is that
it updates the log-MAP ratio adaptively, since the variances
of the conditional probability density functions decrease under
the premise that the algorithm successively estimates the signal
at each iteration. For the noise-free case, in the last iteration
$k = K$, we need to modify the computation of the
ratio slightly, as $P(z_n^K \mid n \in \mathcal{T}) = 1$. Accordingly, the modified ratio
in the last iteration for the noise-free case is given by

$$\Lambda(z_n^K) = \frac{M (z_n^K)^2}{2} + \ln\left(\frac{K}{N-K}\right). \tag{21}$$
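To make the dependence of the log-MAP ratio on $k$, $K$, $M$, and the noise variance concrete, the following sketch evaluates (16) for a single correlation value. The function name and argument order are hypothetical, and the fallback used when the variance under $x_n = 1$ vanishes reflects our reading of the modified last-iteration ratio.

```python
import math

def log_map_ratio_binary(z, k, K, M, N, sigma_w2=0.0):
    """Log-MAP ratio for a binary signal at iteration k (1-indexed), following (16)."""
    var0 = (K - k + 1) / M + sigma_w2   # variance of z under x_n = 0
    var1 = (K - k) / M + sigma_w2       # variance of z under x_n = 1
    prior = math.log(K / (N - K))       # uniform-support prior term
    if var1 == 0.0:
        # Noise-free last iteration (k = K): use the modified ratio instead.
        return z * z / (2.0 * var0) + prior
    return (z * z / (2.0 * var0)
            - (z - 1.0) ** 2 / (2.0 * var1)
            + 0.5 * math.log(var0 / var1)
            + prior)
```

For example, `log_map_ratio_binary(1.0, k=1, K=4, M=64, N=256)` evaluates the noise-free expression at a correlation value of one, where the second quadratic term vanishes.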


High Noise Power Case: Let us consider the high noise
power scenario, i.e., $\sigma_w^2 \gg 1$. In this case, the MAP ratio in
(16) is approximated, up to additive terms that do not depend on $n$, as

$$\Lambda(z_n^k) \approx \frac{(z_n^k)^2}{2\sigma_w^2} - \frac{(z_n^k - 1)^2}{2\sigma_w^2} = \frac{2 z_n^k - 1}{2\sigma_w^2}. \tag{22}$$

From this, we are able to observe that the selection of the
index with the largest MAP ratio is equivalent to the selection
of the index with the largest correlation value $z_n^k$ in the high
noise power regime, namely,

$$\arg\max_{n} \Lambda(z_n^k) = \arg\max_{n} z_n^k. \tag{23}$$

Therefore, the conventional support detection methods that
select the largest correlation value $z_n^k$ are optimal in the
sense of the MAP detection strategy for the high noise power
regime. For the low-noise and noise-free cases,
however, the selection of the largest absolute value of $z_n^k$ for
the support detection is not optimal. This fact clearly exhibits
the benefits of the proposed MAP-MP over the conventional
OMP algorithm.
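Putting the selection rule (17) together with the updates (18) and (19), one MAP-MP run can be sketched as below. This is an illustrative sketch rather than the authors' code: the function name `map_mp` is hypothetical, NumPy is assumed, and a strictly positive noise variance `sigma_w2` is assumed so that the variance under $x_n = 1$ never vanishes in the last iteration.

```python
import numpy as np

def map_mp(Phi, y, K, sigma_w2):
    """MAP-MP sketch for binary x: K iterations of rule (17) with updates (18)-(19)."""
    M, N = Phi.shape
    norms = np.linalg.norm(Phi, axis=0)
    support = []
    residual = y.copy()
    for k in range(1, K + 1):
        z = (Phi.T @ residual) / norms           # correlation values z_n^k
        var0 = (K - k + 1) / M + sigma_w2        # variance under x_n = 0
        var1 = (K - k) / M + sigma_w2            # variance under x_n = 1 (needs sigma_w2 > 0)
        # Terms of the log-MAP ratio that depend on n; constants are dropped.
        ratio = z ** 2 / (2.0 * var0) - (z - 1.0) ** 2 / (2.0 * var1)
        if support:
            ratio[support] = -np.inf             # skip already selected indices
        support.append(int(np.argmax(ratio)))
        # Binary signal: selected entries equal one, so the residual update
        # subtracts the sum of the selected columns.
        residual = y - Phi[:, support].sum(axis=1)
    x_hat = np.zeros(N)
    x_hat[support] = 1.0
    return x_hat, sorted(support)
```

Unlike OMP, the signed value of `z` matters here, because the hypothesis $x_n = 1$ predicts a correlation close to one rather than merely a large magnitude.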
C. Asymptotic Analysis for Exact Recovery

In this section, we derive a lower bound on the number of measurements
required for exact support recovery when the proposed
MAP-MP is applied to a binary sparse signal. Unlike the
prior analysis approaches that rely on the Restricted Isometry
Property (RIP) or on information-theoretic
analysis tools, we directly compute a lower bound of
the success probability that the proposed algorithm identifies
the $K$-sparse binary signal within $K$ iterations.
Utilizing this, a lower bound on the required measurements
is derived to reconstruct the signal perfectly as the signal
dimension approaches infinity. The following theorem shows
the main analysis result.

Theorem 1. The proposed MAP-MP algorithm perfectly recovers
a $K$-sparse binary vector, $\mathbf{x} \in \{0,1\}^{N}$, with $M$
noisy measurements within $K$ iterations, provided
that the number of measurements scales as

$$M = O\big((K + \tilde{\sigma}_w^2)\ln(N)\big), \tag{24}$$

when $N$ and $K$ go to infinity. Here, $\tilde{\sigma}_w^2$ denotes a normalized
noise variance defined as $\tilde{\sigma}_w^2 = M\sigma_w^2$.

Proof: Without loss of generality, we assume that the first
$K$ columns are the true supports, i.e., $x_n = 1$ for $n \in [1:K]$,
so that $\mathcal{T} = \{1, 2, \ldots, K\}$, and the remaining $N - K$ columns
are the zero supports. Furthermore, we denote $E_s^k$ to be the
successful recovery event in the $k$th iteration. Then,
the success recovery probability of the $K$-sparse signal within
$K$ iterations is given by

$$P_s = P(E_s^1, E_s^2, \ldots, E_s^K)
= P(E_s^1)\,P(E_s^2 \mid E_s^1) \times \cdots \times P(E_s^K \mid E_s^{K-1}, \ldots, E_s^1), \tag{25}$$

where the equality comes from the probability chain rule. To
prove that $P_s$ approaches one asymptotically as $N \to \infty$,
it suffices to check that the algorithm correctly identifies the
column of the true support in the $k$th iteration conditioned that
all the prior iterations recover the true supports successfully,
i.e., $P(E_s^k \mid E_s^{k-1}, \ldots, E_s^1) = 1 - O\left(\frac{1}{N}\right)$ as $N \to \infty$ for any
$k \in [1:K]$.

To detect the support correctly in the $k$th iteration of the
proposed algorithm, the maximum of $\Lambda(z_\ell^k)$ for $\ell \in \mathcal{T} \setminus \mathcal{S}^{k-1}$
should be larger than the maximum of $\Lambda(z_n^k)$ for $n \in \mathcal{T}^c =
\{K+1, \ldots, N\}$, which is

$$P(E_s^k \mid E_s^{k-1}, \ldots, E_s^1) = P\left[\max_{\ell \in \mathcal{T} \setminus \mathcal{S}^{k-1}} \Lambda(z_\ell^k) > \max_{n \in \mathcal{T}^c} \Lambda(z_n^k)\right].$$

By selecting an arbitrary element $\ell \in \mathcal{T} \setminus \mathcal{S}^{k-1}$, a lower bound
of the success probability in the $k$th iteration is given by

$$P(E_s^k \mid E_s^{k-1}, \ldots, E_s^1) \geq P\left[\Lambda(z_\ell^k) > \max_{n \in \mathcal{T}^c} \Lambda(z_n^k)\right]
= \prod_{n=K+1}^{N} P\big[\Lambda(z_\ell^k) > \Lambda(z_n^k)\big]
= \Big(1 - P\big[\Lambda(z_\ell^k) < \Lambda(z_n^k)\big]\Big)^{N-K}, \tag{26}$$


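where the first equality follows from the fact that
$\{\Lambda(z_{K+1}^k), \ldots, \Lambda(z_N^k)\}$ are mutually independent, as
$\{z_{K+1}^k, \ldots, z_N^k\}$ are IID Gaussian random variables with zero
mean and variance $\sigma_0^2$. To this end, we need to compute the
probability that $\Lambda(z_\ell^k)$ is less than $\Lambda(z_n^k)$ as follows:

$$P\big[\Lambda(z_n^k) > \Lambda(z_\ell^k)\big]
= P\left[\frac{(z_n^k)^2}{2\sigma_0^2} - \frac{(z_n^k - 1)^2}{2\sigma_1^2} > \frac{(z_\ell^k)^2}{2\sigma_0^2} - \frac{(z_\ell^k - 1)^2}{2\sigma_1^2}\right]
\leq \min_{\lambda > 0} \mathbb{E}\left[e^{\lambda\left(\frac{(z_n^k)^2}{2\sigma_0^2} - \frac{(z_n^k - 1)^2}{2\sigma_1^2}\right)}\right] \mathbb{E}\left[e^{-\lambda\left(\frac{(z_\ell^k)^2}{2\sigma_0^2} - \frac{(z_\ell^k - 1)^2}{2\sigma_1^2}\right)}\right], \tag{27}$$

where the last inequality follows from Markov's inequality and
the independence of $z_n^k$ and $z_\ell^k$. Since $z_n^k$ given $x_n = 0$ is
distributed as in (11), the first term in (27) is calculated as

$$\mathbb{E}\left[e^{\lambda\left(\frac{(z_n^k)^2}{2\sigma_0^2} - \frac{(z_n^k - 1)^2}{2\sigma_1^2}\right)}\right]
= \int_{-\infty}^{\infty} e^{\lambda\left(\frac{t^2}{2\sigma_0^2} - \frac{(t-1)^2}{2\sigma_1^2}\right)} \frac{e^{-\frac{t^2}{2\sigma_0^2}}}{\sqrt{2\pi}\,\sigma_0}\, dt
= \frac{\sigma_1}{\sqrt{\lambda\sigma_0^2 + (1-\lambda)\sigma_1^2}} \exp\left(-\frac{\lambda(1-\lambda)}{2\big(\lambda\sigma_0^2 + (1-\lambda)\sigma_1^2\big)}\right). \tag{28}$$

Similarly, using the distribution of $z_\ell^k$ given $x_\ell = 1$ in (12) with $\|\mathbf{a}_\ell\|_2 \approx 1$,
the second term in (27) is computed as

$$\mathbb{E}\left[e^{-\lambda\left(\frac{(z_\ell^k)^2}{2\sigma_0^2} - \frac{(z_\ell^k - 1)^2}{2\sigma_1^2}\right)}\right]
= \int_{-\infty}^{\infty} e^{-\lambda\left(\frac{t^2}{2\sigma_0^2} - \frac{(t-1)^2}{2\sigma_1^2}\right)} \frac{e^{-\frac{(t-1)^2}{2\sigma_1^2}}}{\sqrt{2\pi}\,\sigma_1}\, dt
= \frac{\sigma_0}{\sqrt{(1-\lambda)\sigma_0^2 + \lambda\sigma_1^2}} \exp\left(-\frac{\lambda(1-\lambda)}{2\big((1-\lambda)\sigma_0^2 + \lambda\sigma_1^2\big)}\right). \tag{29}$$

Plugging in $\lambda = \frac{1}{2} > 0$, the probability that the MAP ratio under
the zero support is greater than that under the non-zero support
is upper bounded by

$$P\big[\Lambda(z_n^k) > \Lambda(z_\ell^k)\big] \leq \frac{e^{-\frac{1}{2(\sigma_0^2 + \sigma_1^2)}}}{\frac{1}{2}\left(\frac{\sigma_1}{\sigma_0} + \frac{\sigma_0}{\sigma_1}\right)}. \tag{30}$$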


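Since $\sigma_0^2 = \frac{K-k+1+\tilde{\sigma}_w^2}{M}$ and $\sigma_1^2 = \frac{K-k+\tilde{\sigma}_w^2}{M}$ in the $k$th
iteration, this error upper bound is further simplified as

$$P\big[\Lambda(z_n^k) > \Lambda(z_\ell^k)\big] \leq \frac{e^{-\frac{M}{2(2K-2k+2\tilde{\sigma}_w^2+1)}}}{\frac{1}{2}\left(\sqrt{\frac{K-k+\tilde{\sigma}_w^2}{K-k+1+\tilde{\sigma}_w^2}} + \sqrt{\frac{K-k+1+\tilde{\sigma}_w^2}{K-k+\tilde{\sigma}_w^2}}\right)}
\leq e^{-\frac{M}{2(2K-2k+2\tilde{\sigma}_w^2+1)}}, \tag{31}$$

where the last step uses the fact that $\frac{1}{2}\left(a + \frac{1}{a}\right) \geq 1$ for any $a > 0$.
Plugging (31) into (26), we have a lower bound as follows:

$$P(E_s^k \mid E_s^{k-1}, \ldots, E_s^1) \geq \left(1 - e^{-\frac{M}{2(2K-2k+2\tilde{\sigma}_w^2+1)}}\right)^{N-K}. \tag{32}$$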


From (32), we observe that the success probability in the first
iteration is lower than that of any other remaining iterations,
i.e., $P(E_s^1) \leq P(E_s^k \mid E_s^{k-1}, \ldots, E_s^1)$ for all $k$. It follows that the
lower bound of the exact recovery probability is

$$P_s = P(E_s^1)\,P(E_s^2 \mid E_s^1) \times \cdots \times P(E_s^K \mid E_s^{K-1}, \ldots, E_s^1)
\geq \left(1 - e^{-\frac{M}{2(2K-1+2\tilde{\sigma}_w^2)}}\right)^{K(N-K)}. \tag{33}$$

Assuming $M = (1+\epsilon)\,2(2K-1+2\tilde{\sigma}_w^2)\ln(K(N-K))$, the
lower bound is rewritten as

$$\ln(P_s) \geq K(N-K)\ln\left(1 - \frac{1}{\{K(N-K)\}^{1+\epsilon}}\right).$$

Let $K = \delta N$ for a positive $\delta > 0$. Then, as $N$ goes to infinity,
we have

$$\lim_{N \to \infty} \ln(P_s) \geq \lim_{N \to \infty} N^2\delta(1-\delta)\ln\left(1 - \frac{1}{\{N^2\delta(1-\delta)\}^{1+\epsilon}}\right)
= \lim_{N \to \infty} \frac{-(1+\epsilon)\,N^2\delta(1-\delta)}{\{N^2\delta(1-\delta)\}^{1+\epsilon} - 1}
= 0, \tag{34}$$

where the second equality follows from L'Hospital's rule. Consequently,
we conclude that $\lim_{N \to \infty} P_s = 1$. From the facts
that $N > M > 2K$ (the condition for a unique sparse solution)
and $\ln(K(N-K)) = \ln(N-K) + \ln(K) < 2\ln(N-K)$, it is
possible that the $K$-sparse binary signal is perfectly recovered
within $K$ iterations, if the number of measurements
scales as, at least, $M \geq (1+\epsilon)\,2(4K-2+4\tilde{\sigma}_w^2)\ln(N-K)$ for
some $\epsilon > 0$. Therefore, the scaling law of the required number
of measurements becomes $M = O\big((K + \tilde{\sigma}_w^2)\ln(N)\big)$, which
completes the proof. ■

Theorem 1 shows the statistical guarantee of the proposed
MAP-MP algorithm for the binary signal. The guarantee is that
the proposed MAP-MP algorithm recovers the $K$-sparse binary
signal perfectly within $K$ iterations, if the number
of (noisy) measurements scales as $O\big((K + \tilde{\sigma}_w^2)\ln(N)\big)$.
This measurement scaling law clearly exhibits that the required
number of measurements should increase with the sparsity level $K$ and
the normalized noise variance $\tilde{\sigma}_w^2$. This result backs the
intuition that the number of measurements should increase linearly
with the sparsity level and the noise variance. Meanwhile, the required
number of measurements increases logarithmically with $N$. This condition
extends the existing statistical guarantee for OMP proven in
[10] by incorporating noise effects.
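The bound (33) and the measurement count assumed in the proof are easy to evaluate numerically. The sketch below does so under the normalization $\tilde{\sigma}_w^2 = M\sigma_w^2$; the function names and the example values are illustrative choices made here, not values reported in the paper.

```python
import math

def measurements_for(N, K, sigma_w2_norm, eps):
    """M = (1 + eps) * 2 * (2K - 1 + 2 * sigma_norm^2) * ln(K (N - K)), as assumed in the proof."""
    return (1.0 + eps) * 2.0 * (2 * K - 1 + 2.0 * sigma_w2_norm) * math.log(K * (N - K))

def recovery_lower_bound(M, N, K, sigma_w2_norm):
    """Lower bound (33) on the exact-recovery probability of MAP-MP."""
    p_err = math.exp(-M / (2.0 * (2 * K - 1 + 2.0 * sigma_w2_norm)))
    # log1p keeps precision when p_err is tiny.
    return math.exp(K * (N - K) * math.log1p(-p_err))

M = measurements_for(N=1000, K=10, sigma_w2_norm=1.0, eps=0.5)
print(round(M), recovery_lower_bound(M, N=1000, K=10, sigma_w2_norm=1.0))
```

With these example values the required $M$ stays below $N$ while the bound is already close to one, which is the qualitative behavior the scaling law describes.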


IV. MAP-OMP 

In the previous section, we proposed a new sparse
signal recovery algorithm, assuming a binary sparse signal. In
some applications, however, the component of the non-zero
support can be an arbitrary value drawn from a continuous
probability distribution $f_{x_n}(u)$. In this section, we present a
modified MAP-MP algorithm for the sparse signal whose non-zero
element is distributed according to a distribution $f_{x_n}(u)$,
which is referred to as MAP-OMP. In contrast to the MAP-MP
algorithm, MAP-OMP uses an orthogonal projection method
when the unknown elements are estimated, which causes
estimation errors. Therefore, it is essential to characterize the
statistical properties of the estimation errors in each iteration in
order to apply a hypothesis test. The following lemma shows
the statistical properties of the estimation errors.

Lemma 4. Let $\hat{\mathbf{x}}|_{\mathcal{S}^k}$ be the least-squares (orthogonal projection) estimate
of $\mathbf{x}|_{\mathcal{S}^k}$. Then, the mean vector and the covariance matrix of the
estimation error, $\mathbf{e}_k = \mathbf{x}|_{\mathcal{S}^k} - \hat{\mathbf{x}}|_{\mathcal{S}^k}$, are

$$\mathbb{E}[\mathbf{e}_k] = \mathbf{0} \quad \text{and} \quad \mathbb{E}[\mathbf{e}_k\mathbf{e}_k^T] = \frac{M\sigma_w^2}{M-k-1}\,\mathbf{I}. \tag{35}$$

Proof: See Appendix D. ■
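The error statistics of the least-squares step can be checked numerically. The sketch below assumes the support is correct and of size $k$, uses the IID $\mathcal{N}(0, 1/M)$ sensing model, and compares the empirical per-entry squared error with $M\sigma_w^2/(M-k-1)$, which is the value implied by the mean of an inverse-Wishart matrix. The function name and trial counts are hypothetical.

```python
import numpy as np

def ls_error_second_moment(M, k, sigma_w, trials, rng):
    """Average squared least-squares error per entry on a correct support of size k."""
    total = 0.0
    for _ in range(trials):
        Phi_S = rng.normal(0.0, 1.0 / np.sqrt(M), size=(M, k))
        x_S = rng.normal(size=k)                      # arbitrary non-zero values
        y = Phi_S @ x_S + rng.normal(0.0, sigma_w, size=M)
        x_hat, *_ = np.linalg.lstsq(Phi_S, y, rcond=None)
        total += np.mean((x_S - x_hat) ** 2)
    return total / trials

rng = np.random.default_rng(4)
M, k, sigma_w = 40, 5, 0.1
empirical = ls_error_second_moment(M, k, sigma_w, 2000, rng)
predicted = M * sigma_w ** 2 / (M - k - 1)
print(empirical, predicted)
```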

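Using this lemma, we explain the MAP-OMP algorithm.
Let $\hat{x}_i$ be the estimate of $x_i$ in the $(k-1)$th iteration, where
$i \in \mathcal{S}^{k-1}$. Then, the residual vector is

$$\mathbf{r}^{k-1} = \mathbf{y} - \mathbf{\Phi}|_{\mathcal{S}^{k-1}}\,\hat{\mathbf{x}}|_{\mathcal{S}^{k-1}}
= \sum_{\ell \in \mathcal{T} \setminus \mathcal{S}^{k-1}} \mathbf{a}_\ell x_\ell + \sum_{i \in \mathcal{S}^{k-1}} \mathbf{a}_i e_i + \mathbf{w}, \tag{36}$$

where $e_i = x_i - \hat{x}_i$ denotes the estimation error caused by the orthogonal
projection. To identify the support element, the MAP-OMP
algorithm performs hypothesis testing by computing the
correlation value $z_n^k = \frac{\mathbf{a}_n^T \mathbf{r}^{k-1}}{\|\mathbf{a}_n\|_2}$ as

$$z_n^k = \sum_{\ell \in \mathcal{T} \setminus \mathcal{S}^{k-1}} \frac{\mathbf{a}_n^T \mathbf{a}_\ell}{\|\mathbf{a}_n\|_2} x_\ell + \sum_{i \in \mathcal{S}^{k-1}} \frac{\mathbf{a}_n^T \mathbf{a}_i}{\|\mathbf{a}_n\|_2} e_i + \frac{\mathbf{a}_n^T \mathbf{w}}{\|\mathbf{a}_n\|_2}
= \|\mathbf{a}_n\|_2 x_n + \sum_{\ell \in \mathcal{T} \setminus \{\mathcal{S}^{k-1} \cup \{n\}\}} \frac{\mathbf{a}_n^T \mathbf{a}_\ell}{\|\mathbf{a}_n\|_2} x_\ell + \sum_{i \in \mathcal{S}^{k-1}} \frac{\mathbf{a}_n^T \mathbf{a}_i}{\|\mathbf{a}_n\|_2} e_i + \frac{\mathbf{a}_n^T \mathbf{w}}{\|\mathbf{a}_n\|_2}, \tag{37}$$

where $x_n$ is distributed as $f_{x_n}(u)$. The exact characterization
of the distribution of $z_n^k$ under the null hypothesis is challenging,
as it highly depends on the signal distribution $f_{x_n}(u)$.
To facilitate simplified calculations, the distribution of $z_n^k$ is
approximated by a Gaussian distribution with matching first-
and second-order moments. From Lemma 1, recall
that $\frac{\mathbf{a}_n^T \mathbf{a}_\ell}{\|\mathbf{a}_n\|_2}$ and $\frac{\mathbf{a}_n^T \mathbf{w}}{\|\mathbf{a}_n\|_2}$ are distributed as $\mathcal{N}(0, \frac{1}{M})$ and
$\mathcal{N}(0, \sigma_w^2)$. Furthermore, with $\mathbb{E}[x_\ell] = \mu$ and $\mathbb{E}[x_\ell^2] = \sigma_x^2$ for
$\ell \in \mathcal{T}$, the first and second moments of $z_n^k$ are

$$\mathbb{E}[z_n^k \mid x_n = 0] = \sum_{\ell \in \mathcal{T} \setminus \mathcal{S}^{k-1}} \mathbb{E}\left[\frac{\mathbf{a}_n^T \mathbf{a}_\ell}{\|\mathbf{a}_n\|_2}\right]\mathbb{E}[x_\ell]
+ \sum_{i \in \mathcal{S}^{k-1}} \mathbb{E}\left[\frac{\mathbf{a}_n^T \mathbf{a}_i}{\|\mathbf{a}_n\|_2}\right]\mathbb{E}[e_i]
+ \mathbb{E}\left[\frac{\mathbf{a}_n^T \mathbf{w}}{\|\mathbf{a}_n\|_2}\right] = 0 \tag{38}$$

and

$$\mathbb{E}\big[(z_n^k)^2 \mid x_n = 0\big] = \sum_{\ell \in \mathcal{T} \setminus \mathcal{S}^{k-1}} \mathbb{E}\left[\left(\frac{\mathbf{a}_n^T \mathbf{a}_\ell}{\|\mathbf{a}_n\|_2}\right)^2\right]\mathbb{E}[x_\ell^2]
+ \sum_{i \in \mathcal{S}^{k-1}} \mathbb{E}\left[\left(\frac{\mathbf{a}_n^T \mathbf{a}_i}{\|\mathbf{a}_n\|_2}\right)^2\right]\mathbb{E}[e_i^2]
+ \mathbb{E}\left[\left(\frac{\mathbf{a}_n^T \mathbf{w}}{\|\mathbf{a}_n\|_2}\right)^2\right]$$
$$= \frac{(K-k+1)\sigma_x^2}{M} + \frac{(k-1)\sigma_w^2}{M-k-1} + \sigma_w^2
= \frac{\sigma_x^2(K-k+1) + \tilde{\sigma}_w^2\left(1 + \frac{k-1}{M-k-1}\right)}{M}, \tag{39}$$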
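where $\mathbb{E}[e_i^2] = \frac{M\sigma_w^2}{M-k-1}$ from Lemma 4 and $\tilde{\sigma}_w^2 = M\sigma_w^2$. Accordingly,
the approximated distribution of $z_n^k$ is given by

$$P(z_n^k \mid x_n = 0) \approx \frac{1}{\sigma_0\sqrt{2\pi}} \exp\left(-\frac{(z_n^k)^2}{2\sigma_0^2}\right), \tag{40}$$

where $\sigma_0 = \sqrt{\frac{\sigma_x^2(K-k+1) + \tilde{\sigma}_w^2\left(1 + \frac{k-1}{M-k-1}\right)}{M}}$. Similarly,
conditioning on the hypothesis of $x_n = u$, the approximated
distribution of $z_n^k$ is given by

$$P(z_n^k \mid x_n = u) \approx \frac{1}{\sigma_1\sqrt{2\pi}} \exp\left(-\frac{(z_n^k - u)^2}{2\sigma_1^2}\right), \tag{41}$$

where $\sigma_1 = \sqrt{\frac{\sigma_x^2(K-k) + \tilde{\sigma}_w^2\left(1 + \frac{k-1}{M-k-1}\right)}{M}}$. Utilizing
the approximated distributions in (40) and (41), the log-MAP
ratio is obtained by marginalizing with respect to the
distribution $f_{x_n}(u)$, namely,

$$\Lambda(z_n^k) = \ln\left(\frac{\int_{-\infty}^{\infty} P(z_n^k \mid x_n = u)\, f_{x_n}(u)\, du}{P(z_n^k \mid x_n = 0)}\right) + \ln\left(\frac{K}{N-K}\right). \tag{42}$$

Therefore, the proposed MAP support detection for the non-binary
signal is to select the support index such that

$$J^k = \arg\max_{n \in [1:N]} \Lambda(z_n^k)
= \arg\max_{n \in [1:N]} \ln\left(\frac{\int_{-\infty}^{\infty} P(z_n^k \mid x_n = u)\, f_{x_n}(u)\, du}{P(z_n^k \mid x_n = 0)}\right). \tag{43}$$

TABLE I
MAP-OMP Algorithm

1) Initialization:
   $k := 0$, $\hat{\mathbf{x}}^0 = \mathbf{0}$
   $\mathbf{r}^0 := \mathbf{y}$ (the current residual)
   $\mathcal{S}^0 := \emptyset$.
2) Repeat until a stopping criterion is met
   i) $k := k + 1$.
   ii) Compute the current proxy:
       $z_n^k = \frac{\mathbf{a}_n^T \mathbf{r}^{k-1}}{\|\mathbf{a}_n\|_2}$ for $n \in [1:N]$.
   iii) Select the largest index of MAP ratio:
       $J^k := \arg\max_n \{\Lambda_d(z_n^k)\}$ for $d \in \{U, C, G\}$.
   iv) Merge the support set:
       $\mathcal{S}^k := \mathcal{S}^{k-1} \cup \{J^k\}$.
   v) Update sparse signal:
       $\hat{\mathbf{x}}|_{\mathcal{S}^k} := \arg\min_{\mathbf{x}} \|\mathbf{\Phi}|_{\mathcal{S}^k}\mathbf{x} - \mathbf{y}\|_2$.
   vi) Update the residual for next round:
       $\mathbf{r}^k := \mathbf{y} - \mathbf{\Phi}|_{\mathcal{S}^k}\,\hat{\mathbf{x}}|_{\mathcal{S}^k}$.

To provide a more transparent interpretation of the
MAP ratio in (43), we consider the following three cases of
interest.

Example 1 (Uniformly Distributed Signal): One basic case
is the scenario where the non-zero elements of the signal are
drawn from a uniform distribution between 0 and 1, i.e.,
$f_{x_n}(u) = 1$ for $0 \leq u \leq 1$. Then, the MAP ratio expression
in (43) becomes

$$\Lambda_U(z_n^k) = \ln\left(\frac{\frac{1}{2}\left[\mathrm{Erf}\left(\frac{z_n^k}{\sqrt{2}\sigma_1}\right) - \mathrm{Erf}\left(\frac{z_n^k - 1}{\sqrt{2}\sigma_1}\right)\right]}{\frac{1}{\sqrt{2\pi}\sigma_0}\exp\left(-\frac{(z_n^k)^2}{2\sigma_0^2}\right)}\right), \tag{44}$$

where $\mathrm{Erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt$.

Example 2 (Sparse Signal with Finite Alphabet): Another
popular case of interest is one where the non-zero entry of
$\mathbf{x}$ is uniformly selected from the elements of a finite
alphabet $\mathcal{C} = \{c_1, \ldots, c_{|\mathcal{C}|}\}$.
For example, each pixel of a bitmap image file is capable
of storing 8 different colors when the 3-bit per pixel (3bpp)
format is used. In this application, the finite set can be given
as $\mathcal{C} = \{0, 1, \ldots, 7\}$. In this case, the log-MAP ratio is computed
as follows:

$$\Lambda_C(z_n^k) = \ln\left(\frac{\sum_{i=1}^{|\mathcal{C}|} P(z_n^k \mid x_n = c_i)\,P[x_n = c_i]}{P(z_n^k \mid x_n = 0)}\right)
= \ln\left(\frac{\frac{1}{|\mathcal{C}|}\sum_{i=1}^{|\mathcal{C}|} \frac{1}{\sqrt{2\pi}\sigma_1}\exp\left(-\frac{(z_n^k - c_i)^2}{2\sigma_1^2}\right)}{\frac{1}{\sqrt{2\pi}\sigma_0}\exp\left(-\frac{(z_n^k)^2}{2\sigma_0^2}\right)}\right). \tag{45}$$

It is worth noting that when $|\mathcal{C}| = 1$ and $c_1 = 1$, this MAP ratio
approximation in (45) recovers the exact MAP ratio for the
binary signal case given in (16).

Example 3 (Gaussian Signal): Assuming $f_{x_n}(u) = \frac{1}{\sqrt{2\pi\sigma_x^2}}\exp\left(-\frac{(u-\mu)^2}{2\sigma_x^2}\right)$,
the log-MAP ratio simplifies to

$$\Lambda_G(z_n^k) = \ln\left(\frac{\int_{-\infty}^{\infty} P(z_n^k \mid x_n = u)\,\frac{1}{\sqrt{2\pi\sigma_x^2}}e^{-\frac{(u-\mu)^2}{2\sigma_x^2}}\,du}{P(z_n^k \mid x_n = 0)}\right)
= \ln\left(\frac{\frac{1}{\sqrt{2\pi(\sigma_x^2+\sigma_1^2)}}\exp\left(-\frac{(z_n^k-\mu)^2}{2(\sigma_x^2+\sigma_1^2)}\right)}{\frac{1}{\sqrt{2\pi}\sigma_0}\exp\left(-\frac{(z_n^k)^2}{2\sigma_0^2}\right)}\right)
= \frac{(z_n^k)^2}{2\sigma_0^2} - \frac{(z_n^k-\mu)^2}{2(\sigma_x^2+\sigma_1^2)} + \ln\left(\frac{\sigma_0}{\sqrt{\sigma_x^2+\sigma_1^2}}\right). \tag{46}$$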


In the case in which the signal is distributed as zero-mean
Gaussian, i.e., $\mathbb{E}[x_\ell] = 0$, the algorithm selects the index
that maximizes $\Lambda_G(z_n^k)$, which for $\mu = 0$ is an increasing
function of $(z_n^k)^2$. This is
the same selection criterion used in the conventional OMP
algorithm; thereby, there is no benefit in using the proposed
method compared to the OMP algorithm. In contrast, when the
Gaussian signal has a non-zero mean value, i.e., $\mathbb{E}[x_\ell] \neq 0$,
the proposed algorithm provides a better support detection
probability than that of the conventional OMP algorithm.

Using these approximated log-MAP ratio functions in the
examples, we provide the MAP-OMP algorithm, which is
summarized in Table I. The key difference from the MAP-MP
algorithm for the binary signal is that MAP-OMP computes
the MAP ratio differently depending on the sparse signal
distribution. Furthermore, the algorithm estimates the sparse
signal using a least-squares solution in each iteration, similar to
the conventional OMP algorithm.
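The three example ratios can be written as small scalar functions. This is a hedged sketch: the function names are hypothetical, `var0` and `var1` stand for the moment-matched variances under $x_n = 0$ and $x_n = u$, the prior term $\ln(K/(N-K))$ is omitted because it does not affect the arg max, and the tiny floor inside the logarithms is a numerical safeguard added here.

```python
import math

def ratio_uniform(z, var0, var1):
    """Log-MAP ratio for x_n ~ Uniform(0, 1), prior term omitted."""
    s = math.sqrt(2.0 * var1)
    num = 0.5 * (math.erf(z / s) - math.erf((z - 1.0) / s))
    num = max(num, 1e-300)                       # guard against log(0) far from [0, 1]
    log_den = -0.5 * math.log(2.0 * math.pi * var0) - z * z / (2.0 * var0)
    return math.log(num) - log_den

def ratio_alphabet(z, var0, var1, alphabet):
    """Log-MAP ratio for x_n uniform over a finite alphabet, prior term omitted."""
    num = sum(math.exp(-(z - c) ** 2 / (2.0 * var1)) for c in alphabet)
    num /= len(alphabet) * math.sqrt(2.0 * math.pi * var1)
    num = max(num, 1e-300)
    log_den = -0.5 * math.log(2.0 * math.pi * var0) - z * z / (2.0 * var0)
    return math.log(num) - log_den

def ratio_gaussian(z, var0, var1, mu, var_x):
    """Log-MAP ratio for x_n ~ N(mu, var_x), prior term omitted."""
    v = var1 + var_x
    return z * z / (2.0 * var0) - (z - mu) ** 2 / (2.0 * v) + 0.5 * math.log(var0 / v)
```

With `alphabet=[1.0]`, `ratio_alphabet` reduces to the binary expression, and with `mu=0.0` the Gaussian ratio is symmetric in `z`, in line with the observation that the zero-mean Gaussian case offers no gain over correlation-based selection.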

V. Extension to the Other Greedy Algorithms 

One advantage of the proposed MAP support detection
method is that it is directly applicable to many other greedy
sparse signal recovery algorithms. In this section, we provide
a set of greedy sparse signal recovery algorithms that exploit
the proposed support detection method.



















































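TABLE II
MAP-gOMP Algorithm

1) Initialization:
   $k := 0$, $\hat{\mathbf{x}}^0 = \mathbf{0}$
   $\mathbf{r}^0 := \mathbf{y}$ (the current residual)
   $\mathcal{S}^0 := \emptyset$ and $\Omega^0 := \emptyset$
2) Repeat until a stopping criterion is met:
   i) $k := k + 1$.
   ii) Compute the current proxy:
       $z_n^k = \frac{\mathbf{a}_n^T \mathbf{r}^{k-1}}{\|\mathbf{a}_n\|_2}$ for $n \in [1 : N]$.
   iii) Select the $L$ largest indices of MAP ratio:
       $\Omega^k := \arg\max_{|\Omega| = L} \{\Lambda_d(z_n^k)\}$ for $d \in \{U, C, G\}$.
   iv) Merge the support set:
       $\mathcal{S}^k \leftarrow \mathcal{S}^{k-1} \cup \Omega^k$.
   v) Perform a least-squares signal estimation:
       $\hat{\mathbf{x}}_{|\mathcal{S}^k} := \arg\min_{\mathbf{x}} \|\mathbf{\Phi}_{|\mathcal{S}^k}\mathbf{x} - \mathbf{y}\|_2$.
   vi) Update the residual for the next round:
       $\mathbf{r}^k = \mathbf{y} - \mathbf{\Phi}_{|\mathcal{S}^k}\hat{\mathbf{x}}_{|\mathcal{S}^k}$.

TABLE III
MAP-CoSaMP Algorithm

1) Initialization:
   $k := 0$, $\hat{\mathbf{x}}^0 = \mathbf{0}$
   $\mathbf{r}^0 := \mathbf{y}$ (the current residual)
   $\mathcal{S}^0 := \emptyset$ and $\Omega^0 := \emptyset$
2) Repeat until a stopping criterion is met:
   i) $k := k + 1$.
   ii) Compute the current proxy:
       $z_n^k = \frac{\mathbf{a}_n^T \mathbf{r}^{k-1}}{\|\mathbf{a}_n\|_2}$ for $n \in [1 : N]$.
   iii) Select the $2K$ largest indices of MAP ratio:
       $\Omega^k := \arg\max_{|\Omega| = 2K} \{\Lambda_d(z_n^k)\}$ for $d \in \{U, C, G\}$.
   iv) Merge the support set:
       $\mathcal{S}^k \leftarrow \mathcal{S}^{k-1} \cup \Omega^k$.
   v) Perform a least-squares signal estimation:
       $\hat{\mathbf{x}}_{|\mathcal{S}^k} := \arg\min_{\mathbf{x}} \|\mathbf{\Phi}_{|\mathcal{S}^k}\mathbf{x} - \mathbf{y}\|_2$, $\hat{\mathbf{x}}_{|(\mathcal{S}^k)^c} = \mathbf{0}$.
   vi) Prune $\hat{\mathbf{x}}^k$:
       $\mathcal{S}^k := \arg\max_{|\mathcal{S}| = K} \{|\hat{\mathbf{x}}^k|\}$.
   vii) Update the residual for the next round:
       $\mathbf{r}^k = \mathbf{y} - \mathbf{\Phi}_{|\mathcal{S}^k}\hat{\mathbf{x}}_{|\mathcal{S}^k}$.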

A. MAP-gOMP

gOMP [17] is a simple yet effective algorithm that improves
the performance of OMP. The key idea of gOMP is the selection
of multiple support indices with the largest correlation in
magnitude at each iteration; thereby, it reduces the
mis-detection probability compared to that of OMP. Similar to
the gOMP algorithm, MAP-gOMP is a greedy algorithm that
sequentially finds multiple support indices and estimates the
signal representation within a certain number of iterations.
The core difference lies in the selection rule of the support
indices per iteration. Unlike the gOMP algorithm, MAP-gOMP
chooses $L$ support indices with the largest log-MAP ratio
values instead of the largest correlation in magnitude.
Therefore, in the $k$th iteration, we update the variances of
the two conditional distributions as

$$\sigma_0^2 = \frac{(K - L(k-1))\sigma_x^2 + M\sigma_w^2}{M}\left(1 + \frac{L(k-1)}{M - L(k-1) - 2}\right)$$

and

$$\sigma_1^2 = \frac{(K - Lk)\sigma_x^2 + M\sigma_w^2}{M} + \frac{L(k-1)}{M}\cdot\frac{(K - L(k-1))\sigma_x^2 + M\sigma_w^2}{M - L(k-1) - 2}.$$

The proposed MAP-gOMP is summarized in Table II.
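The difference between the two selection rules can be seen on a toy example. The sketch below is illustrative only: it assumes a binary signal, for which the log ratio compares a Gaussian proxy centred at 1 (variance `var1`) with one centred at 0 (variance `var0`). The proxy values, the variances, and all function names are hypothetical.

```python
import math


def binary_log_ratio(z, var0, var1):
    """Log-likelihood ratio of N(z; 1, var1) against N(z; 0, var0)."""
    return (z * z / (2.0 * var0)
            - (z - 1.0) ** 2 / (2.0 * var1)
            + 0.5 * math.log(var0 / var1))


def select_by_correlation(z, L):
    """gOMP-style rule: the L indices with the largest |z_n|."""
    order = sorted(range(len(z)), key=lambda n: abs(z[n]), reverse=True)
    return sorted(order[:L])


def select_by_log_map(z, L, ratio):
    """MAP-gOMP-style rule: the L indices with the largest log ratio."""
    order = sorted(range(len(z)), key=lambda n: ratio(z[n]), reverse=True)
    return sorted(order[:L])


proxies = [-1.1, 0.9, 0.2, 0.8]  # hypothetical proxy values z_n


def ratio(z):
    # Hypothetical variances under the null and support hypotheses.
    return binary_log_ratio(z, var0=0.3, var1=0.25)


by_corr = select_by_correlation(proxies, L=2)
by_map = select_by_log_map(proxies, L=2, ratio=ratio)
```

Here the correlation rule keeps the index with proxy -1.1 because of its magnitude, while the log-ratio rule discards it: a large negative proxy is unlikely when the non-zero entries equal one.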

B. MAP-CoSaMP 

CoSaMP is an effective iterative sparse signal recovery
algorithm [18]. It was shown to yield the same sparse signal
recovery performance guarantees as $\ell_1$-norm minimization
with lower computational complexity. The main idea of
CoSaMP is that, in the first step, it estimates a large support
set with the $L$ largest correlation values in magnitude and
obtains a least-squares solution based on it, where $L$ is
typically chosen such that $K \le L \le 2K$. In the next step,
the algorithm reduces the cardinality of the support set back
to the desired sparsity level of $K$ using pruning, and acquires
a sparse solution again based on the reduced support.

We modify this algorithm by incorporating the proposed
support detection technique. Unlike the conventional CoSaMP
algorithm, MAP-CoSaMP adds $2K$ support candidates with the
$2K$ largest MAP ratio values to the support set $\mathcal{S}^k$
per iteration. Once the least-squares solution is obtained based
on the corresponding support set, an approximation to the signal
is updated by selecting the $K$ largest coordinates using
pruning. Finally, the residual is updated using the approximated
signal estimate. The algorithm is described in Table III. The
computational complexity order of the proposed MAP-CoSaMP is the
same as that of the original CoSaMP algorithm [18]. We refer
readers who are interested in the computational complexity
analysis of CoSaMP to [18].

TABLE IV
MAP-SP Algorithm

1) Initialization:
   $k := 0$, $\hat{\mathbf{x}}^0 = \mathbf{0}$
   $\mathbf{r}^0 := \mathbf{y}$ (the current residual)
   $\mathcal{S}^0 := \emptyset$ and $\Omega^0 := \emptyset$
2) Repeat until a stopping criterion is met:
   i) $k := k + 1$.
   ii) Compute the current proxy:
       $z_n^k = \frac{\mathbf{a}_n^T \mathbf{r}^{k-1}}{\|\mathbf{a}_n\|_2}$ for $n \in [1 : N]$.
   iii) Select the $K$ largest indices of MAP ratio:
       $\Omega^k := \arg\max_{|\Omega| = K} \{\Lambda_d(z_n^k)\}$ for $d \in \{U, C, G\}$.
   iv) Merge the support set:
       $\mathcal{S}^k \leftarrow \mathcal{S}^{k-1} \cup \Omega^k$.
   v) Perform a least-squares signal estimation:
       $\mathbf{b}^k := \arg\min_{\mathbf{b}} \|\mathbf{\Phi}_{|\mathcal{S}^k}\mathbf{b} - \mathbf{y}\|_2$.
   vi) Select the $K$ largest indices in $\mathbf{b}^k$:
       $\mathcal{S}^k := \arg\max_{|\mathcal{S}| = K} \{|\mathbf{b}^k|\}$.
   vii) Perform a least-squares signal estimation using the updated $\mathcal{S}^k$:
       $\hat{\mathbf{x}}_{|\mathcal{S}^k} := \arg\min_{\mathbf{x}} \|\mathbf{\Phi}_{|\mathcal{S}^k}\mathbf{x} - \mathbf{y}\|_2$.
   viii) Update the residual for the next round:
       $\mathbf{r}^k = \mathbf{y} - \mathbf{\Phi}_{|\mathcal{S}^k}\hat{\mathbf{x}}_{|\mathcal{S}^k}$.
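The merge and prune steps shared by CoSaMP-type and SP-type iterations are simple set operations. The following sketch shows them in isolation, leaving out the least-squares fit. The function names and the representation of the estimate as an index-to-value dictionary are assumptions made for illustration.

```python
def merge_support(support, candidates):
    """Merge step: union of the previous support and the new candidates."""
    return sorted(set(support) | set(candidates))


def prune_support(x_hat, K):
    """Prune step: keep the K indices whose estimates are largest in magnitude.

    x_hat maps each index of the merged support to its least-squares estimate.
    """
    order = sorted(x_hat, key=lambda n: abs(x_hat[n]), reverse=True)
    return sorted(order[:K])


merged = merge_support([2, 5], [5, 7, 1])
# Hypothetical least-squares estimates on the merged support.
estimates = {1: 0.05, 2: -0.9, 5: 1.1, 7: 0.3}
pruned = prune_support(estimates, K=2)
```

In a CoSaMP-style iteration the pruned estimate is used directly to update the residual, whereas an SP-style iteration would solve a second least-squares problem on `pruned` first.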

C. MAP-SP Algorithm 

SP is a two-step iterative algorithm for sparse recovery.
Similar to CoSaMP, the SP algorithm identifies the current
estimate of the support set by greedily adding multiple indices
with the largest correlation in magnitude. The main difference
between CoSaMP and SP lies in the second step. CoSaMP applies
a pruning technique to the sparse signal estimated in the first
stage to maintain the required sparsity level, without
performing a second least-squares estimation, whereas the SP
algorithm updates the sparse solution by solving a least-squares
problem based on the reduced support in the second stage.

Applying the proposed MAP support detection method, we
modify this algorithm by changing the support set
identification stage. The proposed MAP-SP algorithm selects 2K
support indices with the largest MAP ratio values in each
iteration. The modified algorithm is summarized in Table IV.
Since the log-MAP ratio computation does not increase the
computational complexity order, the proposed algorithm can
be implemented with $O(MNK)$ complexity, which is comparable
to that of the SP algorithm.

Fig. 1. Performance comparison of perfect reconstruction probability for the
binary signal with noise-free measurements.
VI. Numerical Results 

We evaluate the recovery performance of the proposed
algorithms by means of simulations. We measure the empirical
frequency of exact reconstruction for the proposed algorithms
in both the noisy and noiseless cases and compare it with that
of the conventional algorithms. In our simulations, we generate
an $M \times N$ ($M = 128$ and $N = 256$) sensing matrix whose
elements are drawn from the IID Gaussian distribution
$\mathcal{N}(0, \frac{1}{M})$. Furthermore, we consider a
$K$-sparse vector $\mathbf{x}$ whose support is uniformly
distributed. Each non-zero element of $\mathbf{x}$ is one for the
binary signal and is randomly selected from $[0, 1]$ for the
uniform signal. To obtain the empirical frequency of exact
reconstruction, we perform 1,000 independent trials for each
algorithm. For each trial, we iterate until the stopping
criterion on the reconstruction error
$\|\mathbf{x} - \hat{\mathbf{x}}\|_2$ is satisfied, except for
gOMP and MAP-gOMP. For gOMP and MAP-gOMP, we perform
$\min(K, \lfloor M/L \rfloor)$ iterations in each trial, where
$L = 2$. To obtain the performance of BP, we use the CVX tool,
which can be executed in MATLAB.
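One trial of this setup can be generated as in the sketch below. It is not the authors' simulation code: it uses plain Python lists instead of a matrix library, the function names and the `seed` argument are hypothetical, and the iteration cap for gOMP-type algorithms is assumed to be $\min(K, \lfloor M/L \rfloor)$.

```python
import random


def make_problem(M, N, K, signal="binary", seed=0):
    """Draw a sensing matrix, a K-sparse signal and noise-free measurements."""
    rng = random.Random(seed)
    std = (1.0 / M) ** 0.5  # entries are N(0, 1/M)
    phi = [[rng.gauss(0.0, std) for _ in range(N)] for _ in range(M)]
    support = sorted(rng.sample(range(N), K))  # uniformly random support
    x = [0.0] * N
    for n in support:
        # Binary signal: ones; uniform signal: draws from [0, 1).
        x[n] = 1.0 if signal == "binary" else rng.random()
    y = [sum(row[n] * x[n] for n in support) for row in phi]
    return phi, x, y, support


def gomp_iterations(K, M, L):
    """Assumed iteration cap for gOMP-type algorithms."""
    return min(K, M // L)


phi, x, y, support = make_problem(M=128, N=256, K=20)
```

Repeating `make_problem` with different seeds and counting the trials in which an algorithm returns the exact support gives the empirical reconstruction frequency.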

Fig. 1 illustrates the reconstruction probability of a binary
sparse signal with noise-free measurements as a function of the
sparsity level $K$ of the signal. The simulation results reveal
that the proposed algorithms improve the reconstruction
probability significantly compared to the existing algorithms.
For example, the proposed MAP-gOMP recovers the binary sparse
signal with more than 90% probability up to a sparsity level of
42, whereas the conventional gOMP is able to reconstruct the
signal only up to a sparsity level of 31 under the same
reconstruction probability constraint. Furthermore, the proposed
MAP-gOMP, MAP-CoSaMP, and MAP-SP algorithms outperform BP, i.e.,
a linear programming technique, for the binary signal
reconstruction. A non-negative BP algorithm that solves the
$\ell_1$-minimization problem with an additional non-negativity
constraint on $\mathbf{x}$, however, provides better performance
than the proposed algorithms at the expense of higher
computational complexity.

Fig. 2. Performance comparison of perfect reconstruction probability for the
signal whose non-zero element is uniformly distributed between 0 and 1, i.e.,
$x_i \sim \mathrm{Uni}[0,1]$, with noise-free measurements.

Fig. 2 shows the reconstruction probability of a sparse
signal whose non-zero elements are uniformly distributed
between 0 and 1, i.e., $x_i \sim \mathrm{Uni}[0,1]$. We use the
MAP ratio function in (44) for the simulations. Similar to the
binary signal case, the proposed MAP-gOMP, MAP-CoSaMP, and
MAP-SP algorithms outperform the existing sparse recovery
algorithms by considerably reducing the mis-detection
probability of supports. In particular, MAP-gOMP and MAP-SP are
able to recover the signal with more than 95% probability up to
a sparsity level of 60, which is close to the maximum sparsity
level ($M/2 = 64$) that can be recovered with a unique solution
guarantee. Moreover, MAP-SP outperforms the non-negative BP
algorithm.

We now consider a binary sparse image recovery example.
As illustrated in Fig. 3 (the left-top figure), a binary sparse
image of 37 x 37 pixels is considered for the experiment.
Applying a linear random projection matrix
$\mathbf{\Phi} \in \mathbb{R}^{685 \times 1369}$ whose elements
are drawn from $\mathcal{N}(0, \frac{1}{M})$ with $M = 685$, we
compress the binary image. As shown in Fig. 3, when noise-free
measurements are used for image (support) reconstruction, we
observe that the proposed MAP-gOMP and MAP-SP algorithms
outperform the other existing algorithms in support recovery,
which agrees with the result shown in Fig. 1. We then add
Gaussian noise with zero mean and variance
$\sigma_w^2 = 0.005$. In this case, as depicted in Fig. 4, the
proposed MAP-SP method is able to recover the image almost
perfectly even in the presence of noise, whereas the image
reconstruction performance of the proposed MAP-gOMP algorithm
is degraded compared to the noise-free case, which exhibits
the noise sensitivity of the algorithm.

As can be seen in Table V, the proposed algorithms achieve
significant speedup compared to the existing algorithms in
both the noise-free and noisy measurement cases. These
speedup gains are mainly due to the fact that the proposed
algorithms identify the true support set within a small number
of iterations, leading to faster convergence than the existing
algorithms. In particular, under noise-free measurements,
MAP-SP (about 0.21 s) runs about 157 times faster than BP
(about 33.22 s).

Fig. 3. Support recovery performance comparison of the sparse binary image
reconstruction with the compressed and noise-free measurements, i.e.,
$\sigma_w^2 = 0$. All algorithms use the same random linear projection matrix
for image reconstruction.

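To provide insight into how the performance of the
proposed algorithms degrades as the noise variance increases
for given $K$, $M$, and $N$, we plot the normalized mean squared
error (NMSE) of the proposed algorithms as a function of the
signal-to-noise ratio (SNR), where the NMSE is defined as

$$\mathrm{NMSE} = 10\log_{10}\left(\frac{1}{T}\sum_{i=1}^{T}\frac{\|\mathbf{x}_i - \hat{\mathbf{x}}_i\|_2^2}{\|\mathbf{x}_i\|_2^2}\right), \tag{47}$$

where $T$ is the number of trials and the subscript $i$ represents
the trial number. In each random trial, a random Gaussian
matrix $\mathbf{\Phi} \in \mathbb{R}^{128 \times 256}$ is
generated, and the non-zero elements in $\mathbf{x}$ are generated
as Gaussian random variables with mean one and variance $P$. We
assume that the sparsity level is $K = 40$. For this noisy case,
we slightly modify the least-squares signal estimator used in
each algorithm such that

$$\hat{\mathbf{x}}_{|\mathcal{S}^k} = \left(\mathbf{\Phi}_{|\mathcal{S}^k}^T\mathbf{\Phi}_{|\mathcal{S}^k} + \frac{1}{\mathrm{SNR}}\mathbf{I}\right)^{-1}\mathbf{\Phi}_{|\mathcal{S}^k}^T\mathbf{y}.$$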

As illustrated in Fig. 5, the algorithms using the proposed
MAP support detection method outperform the conventional
sparse recovery algorithms. This reveals that the proposed
algorithms are robust to measurement noise. Interestingly, the
proposed algorithms, including MAP-gOMP ($L = 2$) and
MAP-SP, exhibit better NMSE performance than FBMP.
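The NMSE metric is straightforward to compute once the true and estimated signals of all trials are available. The sketch below assumes the usual definition, namely the per-trial squared error normalized by the signal energy, averaged over the trials and expressed in decibels. The function name and the list-based signal representation are hypothetical.

```python
import math


def nmse_db(true_signals, estimates):
    """Average normalized squared error over all trials, in dB."""
    total = 0.0
    for x, x_hat in zip(true_signals, estimates):
        err = sum((a - b) ** 2 for a, b in zip(x, x_hat))
        energy = sum(a * a for a in x)
        total += err / energy
    return 10.0 * math.log10(total / len(true_signals))


# Two hypothetical trials, each with a 1% normalized squared error.
truth = [[1.0, 0.0], [0.0, 2.0]]
guess = [[0.9, 0.0], [0.0, 2.2]]
value = nmse_db(truth, guess)
```

For the two toy trials above, the normalized errors are 0.01 each, so the result is -20 dB up to floating-point rounding.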


Fig. 4. Support recovery performance comparison of the sparse binary image
reconstruction with the compressed and noisy measurements, where
$\sigma_w^2 = 0.005$, equivalently $M\sigma_w^2 = 3.425$.

TABLE V

Algorithm    Runtime (s)     Speedup         Runtime (s)          Speedup
             σ_w² = 0        σ_w² = 0        σ_w² = 0.005         σ_w² = 0.005
gOMP         5.03            6.6x            5.03                 7.1x
MAP-gOMP     2.26            14.6x           4.81                 7.4x
SP           12.97           2.3x            14.5                 2.5x
MAP-SP       0.21            157.8x          15.1                 2.4x
BP           33.22           baseline        35.73                baseline

VII. Conclusion

We have presented a new support detection technique
based on a MAP criterion for greedy sparse signal recovery
algorithms. Using this method, we have proposed a set of
greedy sparse signal recovery algorithms and established a
theoretical signal recovery guarantee for a particular case. One
major implication is that the joint use of the distributions of
the sensing matrix, sparse signal, and noise in support
identification offers a substantial recovery performance
improvement over previous support detection approaches that
ignore such statistical information. Our numerical results
demonstrate that the greedy algorithms with highly reliable
support detection provide significantly better sparse recovery
performance than the linear programming approach.

An interesting direction for future study would be to explore 
the statistical guarantees of the proposed MAP-gOMP, MAP- 
CoSaMP, and MAP-SP. Another possible research direction is 
to investigate the greedy algorithms when different statistical 
distributions of the sensing matrix are used. Furthermore, it 
would be interesting to apply the proposed support detection 
principle to improve the sparse signal reconstruction method 
in (HI- 


Appendix 

A. Proof of Lemma 

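Note that the distribution of each atom vector $\mathbf{a}_n$ is
rotationally invariant. This implies that for any unitary matrix
$\mathbf{U} \in \mathbb{R}^{M \times M}$, the distributions of
$\mathbf{U}\mathbf{a}_n$ and $\mathbf{a}_n$ are identical.
By selecting a unitary matrix $\mathbf{U}$ so that
$\mathbf{U}\mathbf{a}_n / \|\mathbf{a}_n\|_2 = [1, 0, \ldots, 0]^T$,
we can compute the cumulative distribution function of
$\mathbf{a}_n^T\mathbf{a}_\ell / \|\mathbf{a}_n\|_2$ as

$$P\left[\frac{\mathbf{a}_n^T\mathbf{a}_\ell}{\|\mathbf{a}_n\|_2} \le x\right] = P\left[\frac{1}{\|\mathbf{a}_n\|_2}\mathbf{a}_n^T\mathbf{U}^T\mathbf{U}\mathbf{a}_\ell \le x\right] = P\left[a_\ell(1) \le x\right], \tag{48}$$

where $a_\ell(1)$ denotes the first component of $\mathbf{a}_\ell$.
As a result, $\mathbf{a}_n^T\mathbf{a}_\ell / \|\mathbf{a}_n\|_2$
is Gaussian with zero mean and variance $\frac{1}{M}$.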


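Fig. 5. The NMSE performance comparison among different sparse signal
recovery algorithms when T = 1000. For FBMP, we use D = 20, which is
the maximum number of allowable repeated greedy searches.

B. Proof of Lemma

Recall that all elements of $\mathbf{a}_n$ are Gaussian random
variables with zero mean and variance $\frac{1}{M}$ and they are
mutually independent. Thus,

$$P\left[\|\mathbf{a}_n\|_2 \le x\right] = P\left[\sqrt{\sum_{m=1}^{M}(a_n(m))^2} \le x\right] = P\left[\sqrt{\sum_{m=1}^{M}\left(\sqrt{M}a_n(m)\right)^2} \le \sqrt{M}x\right] \overset{(a)}{=} \frac{\gamma\left(\frac{M}{2}, \frac{Mx^2}{2}\right)}{\Gamma\left(\frac{M}{2}\right)}, \tag{49}$$

where (a) follows from the fact that
$\sqrt{\sum_{m=1}^{M}(\sqrt{M}a_n(m))^2}$ is Chi-distributed with
$M$ degrees of freedom, since $\sqrt{M}a_n(m)$ is Gaussian with
zero mean and unit variance, and
$\gamma(s, x) = \int_0^x t^{s-1}e^{-t}\,dt$ denotes the lower
incomplete gamma function. By taking the derivative with respect
to $x$, we obtain the distribution of $\|\mathbf{a}_n\|_2$ as

$$f_{\|\mathbf{a}_n\|_2}(x) = \frac{1}{\Gamma\left(\frac{M}{2}\right)}2^{1 - \frac{M}{2}}M^{\frac{M}{2}}x^{M-1}e^{-\frac{Mx^2}{2}}. \tag{50}$$

Accordingly, the mean of the norm is

$$\mathbb{E}\left[\|\mathbf{a}_n\|_2\right] = \int_0^\infty \frac{2^{1 - \frac{M}{2}}M^{\frac{M}{2}}x^{M}e^{-\frac{Mx^2}{2}}}{\Gamma\left(\frac{M}{2}\right)}\,dx = \sqrt{\frac{2}{M}}\frac{\Gamma\left(\frac{M+1}{2}\right)}{\Gamma\left(\frac{M}{2}\right)}, \tag{51}$$

which completes the proof.

C. Proof of Lemma

We commence by computing the probability that the absolute
difference between the norm and its average is greater
than or equal to a small value $\epsilon$, which is

$$P\left[\left|\|\mathbf{a}_n\|_2 - \mathbb{E}\left[\|\mathbf{a}_n\|_2\right]\right| \ge \epsilon\right] \le \frac{1}{\epsilon^2}\left(1 - \frac{2}{M}\left(\frac{\Gamma\left(\frac{M+1}{2}\right)}{\Gamma\left(\frac{M}{2}\right)}\right)^2\right), \tag{52}$$

where the inequality follows from Chebyshev's inequality
together with $\mathbb{E}[\|\mathbf{a}_n\|_2^2] = 1$.
Since $\frac{2}{M}\left(\Gamma\left(\frac{M+1}{2}\right)/\Gamma\left(\frac{M}{2}\right)\right)^2$
converges to one as $M$ goes to infinity, we conclude that

$$\lim_{M \to \infty} P\left[\left|\|\mathbf{a}_n\|_2 - \mathbb{E}\left[\|\mathbf{a}_n\|_2\right]\right| \ge \epsilon\right] = 0 \tag{53}$$

for any $\epsilon > 0$. As a result, the norm of each column
converges to its mean, and it also converges to one for a large
enough $M$ because
$\lim_{M \to \infty}\sqrt{\frac{2}{M}}\,\Gamma\left(\frac{M+1}{2}\right)/\Gamma\left(\frac{M}{2}\right) = 1$.
This completes the proof.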
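The concentration of the column norm can be checked numerically. The sketch below evaluates the closed-form mean of the norm of a vector with IID $\mathcal{N}(0, \frac{1}{M})$ entries, and the resulting Chebyshev bound, using log-gamma functions to avoid overflow for large $M$. The function names are hypothetical, and the bound uses the fact that the expected squared norm equals one.

```python
import math


def mean_column_norm(M):
    """E[||a_n||_2] = sqrt(2/M) * Gamma((M+1)/2) / Gamma(M/2)."""
    log_ratio = math.lgamma((M + 1) / 2.0) - math.lgamma(M / 2.0)
    return math.sqrt(2.0 / M) * math.exp(log_ratio)


def chebyshev_bound(M, eps):
    """Var(||a_n||_2) / eps^2, where Var = 1 - (E[||a_n||_2])^2."""
    return (1.0 - mean_column_norm(M) ** 2) / eps ** 2


mean_128 = mean_column_norm(128)
bound_128 = chebyshev_bound(128, eps=0.1)
bound_large = chebyshev_bound(10 ** 6, eps=0.1)
```

For $M = 128$, the value used in the simulations, the mean is already within one percent of one, and the bound shrinks as $M$ grows.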


D. Proof of Lemma 

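Let $\mathbf{P}_{|\mathcal{S}^k} = (\mathbf{\Phi}_{|\mathcal{S}^k}^T\mathbf{\Phi}_{|\mathcal{S}^k})^{-1}\mathbf{\Phi}_{|\mathcal{S}^k}^T$
be the projection matrix used to estimate
$\mathbf{x}_{|\mathcal{S}^k}$ in the $k$th iteration. Using this,
the corresponding non-zero elements are obtained as

$$\hat{\mathbf{x}}_{|\mathcal{S}^k} = \mathbf{P}_{|\mathcal{S}^k}\mathbf{y} = \mathbf{x}_{|\mathcal{S}^k} + \mathbf{P}_{|\mathcal{S}^k}\left(\mathbf{\Phi}_{|\mathcal{T}\setminus\mathcal{S}^k}\mathbf{x}_{|\mathcal{T}\setminus\mathcal{S}^k} + \mathbf{w}\right). \tag{54}$$

Then, the mean of the estimation error is

$$\mathbb{E}\left[\hat{\mathbf{x}}_{|\mathcal{S}^k} - \mathbf{x}_{|\mathcal{S}^k}\right] = \mathbb{E}\left[\mathbf{P}_{|\mathcal{S}^k}\mathbf{\Phi}_{|\mathcal{T}\setminus\mathcal{S}^k}\mathbf{x}_{|\mathcal{T}\setminus\mathcal{S}^k}\right] + \mathbb{E}\left[\mathbf{P}_{|\mathcal{S}^k}\mathbf{w}\right] = \mathbf{0}, \tag{55}$$

where the last equality follows from the fact that all elements in
$\mathbf{P}_{|\mathcal{S}^k}$, $\mathbf{\Phi}_{|\mathcal{T}\setminus\mathcal{S}^k}$,
$\mathbf{x}_{|\mathcal{T}\setminus\mathcal{S}^k}$, and $\mathbf{w}$
are mutually independent, with
$\mathbb{E}[\mathbf{\Phi}_{|\mathcal{T}\setminus\mathcal{S}^k}] = \mathbf{0}$
and $\mathbb{E}[\mathbf{w}] = \mathbf{0}$. Next, we compute the
error covariance matrix. Conditioned on the sub-matrix
$\mathbf{\Phi}_{|\mathcal{S}^k}$ being fixed, the error covariance
matrix is

$$\begin{aligned}
\mathbb{E}\left[(\hat{\mathbf{x}}_{|\mathcal{S}^k} - \mathbf{x}_{|\mathcal{S}^k})(\hat{\mathbf{x}}_{|\mathcal{S}^k} - \mathbf{x}_{|\mathcal{S}^k})^T\right]
&= \mathbf{P}_{|\mathcal{S}^k}\mathbb{E}\left[\mathbf{\Phi}_{|\mathcal{T}\setminus\mathcal{S}^k}\mathbf{x}_{|\mathcal{T}\setminus\mathcal{S}^k}\mathbf{x}_{|\mathcal{T}\setminus\mathcal{S}^k}^T\mathbf{\Phi}_{|\mathcal{T}\setminus\mathcal{S}^k}^T\right]\mathbf{P}_{|\mathcal{S}^k}^T + \mathbf{P}_{|\mathcal{S}^k}\mathbb{E}\left[\mathbf{w}\mathbf{w}^T\right]\mathbf{P}_{|\mathcal{S}^k}^T \\
&\overset{(a)}{=} \sigma_x^2\mathbf{P}_{|\mathcal{S}^k}\mathbb{E}\left[\mathbf{\Phi}_{|\mathcal{T}\setminus\mathcal{S}^k}\mathbf{\Phi}_{|\mathcal{T}\setminus\mathcal{S}^k}^T\right]\mathbf{P}_{|\mathcal{S}^k}^T + \sigma_w^2\mathbf{P}_{|\mathcal{S}^k}\mathbf{P}_{|\mathcal{S}^k}^T \\
&\overset{(b)}{=} \left(\frac{\sigma_x^2(K - k)}{M} + \sigma_w^2\right)\left(\mathbf{\Phi}_{|\mathcal{S}^k}^T\mathbf{\Phi}_{|\mathcal{S}^k}\right)^{-1},
\end{aligned} \tag{56}$$

where (a) is due to $\mathbb{E}[\mathbf{w}\mathbf{w}^T] = \sigma_w^2\mathbf{I}$ and
$\mathbb{E}[\mathbf{x}_{|\mathcal{T}\setminus\mathcal{S}^k}\mathbf{x}_{|\mathcal{T}\setminus\mathcal{S}^k}^T] = \sigma_x^2\mathbf{I}$,
and (b) follows from
$\mathbb{E}[\mathbf{\Phi}_{|\mathcal{T}\setminus\mathcal{S}^k}\mathbf{\Phi}_{|\mathcal{T}\setminus\mathcal{S}^k}^T] = \frac{K - k}{M}\mathbf{I}$ and
$\mathbf{P}_{|\mathcal{S}^k}\mathbf{P}_{|\mathcal{S}^k}^T = (\mathbf{\Phi}_{|\mathcal{S}^k}^T\mathbf{\Phi}_{|\mathcal{S}^k})^{-1}$.
Let $\mathbf{b}_i$ and $\mathbf{\Phi}_{|\mathcal{S}^k\setminus i}$
be the $i$th column vector in $\mathbf{\Phi}_{|\mathcal{S}^k}$ and
the submatrix obtained by eliminating $\mathbf{b}_i$ from
$\mathbf{\Phi}_{|\mathcal{S}^k}$, where $i \in \mathcal{S}^k$. The
$i$th diagonal element of
$(\mathbf{\Phi}_{|\mathcal{S}^k}^T\mathbf{\Phi}_{|\mathcal{S}^k})^{-1}$
is given by

$$\left[\left(\mathbf{\Phi}_{|\mathcal{S}^k}^T\mathbf{\Phi}_{|\mathcal{S}^k}\right)^{-1}\right]_{i,i} = \frac{1}{\mathbf{b}_i^T\mathbf{P}_i^{\perp}\mathbf{b}_i}, \tag{57}$$

where $\mathbf{P}_i^{\perp} = \mathbf{I} - \mathbf{\Phi}_{|\mathcal{S}^k\setminus i}(\mathbf{\Phi}_{|\mathcal{S}^k\setminus i}^T\mathbf{\Phi}_{|\mathcal{S}^k\setminus i})^{-1}\mathbf{\Phi}_{|\mathcal{S}^k\setminus i}^T$
is the orthogonal projection onto the null space of
$\mathbf{\Phi}_{|\mathcal{S}^k\setminus i}^T$. Since all
elements in $\mathbf{b}_i$ and $\mathbf{\Phi}_{|\mathcal{S}^k\setminus i}$
are assumed to be IID Gaussian random variables
$\mathcal{N}(0, \frac{1}{M})$,
$M\mathbf{b}_i^T\mathbf{P}_i^{\perp}\mathbf{b}_i$ is distributed as
a Chi-squared random variable with $M - k$ degrees of freedom,
i.e., $M\mathbf{b}_i^T\mathbf{P}_i^{\perp}\mathbf{b}_i \sim \chi^2_{(M-k)}$.
As a result, by marginalizing with respect to the Chi-squared
distribution, we have the variance of the $i$th estimation error as

$$\begin{aligned}
\mathbb{E}\left[(\hat{x}_i - x_i)^2\right]
&= \left(\frac{\sigma_x^2(K - k)}{M} + \sigma_w^2\right)\mathbb{E}\left[\frac{1}{\mathbf{b}_i^T\mathbf{P}_i^{\perp}\mathbf{b}_i}\right] \\
&= \left(\frac{\sigma_x^2(K - k)}{M} + \sigma_w^2\right)M\,\mathbb{E}\left[\frac{1}{M\mathbf{b}_i^T\mathbf{P}_i^{\perp}\mathbf{b}_i}\right] \\
&= \left(\frac{\sigma_x^2(K - k)}{M} + \sigma_w^2\right)\frac{M}{M - k - 2} \\
&= \frac{\sigma_x^2(K - k) + M\sigma_w^2}{M - k - 2},
\end{aligned} \tag{58}$$

which completes the proof.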